ABSTRACT


The use of the "degrees of freedom for signal" is proposed as a
design criterion for comparing different designs for satellite and other
measuring systems. It is also proposed that certain eigensequence plots
be examined at the design stage along with appropriate estimates of the
parameter $\lambda$ playing the role of noise to signal ratio. The degrees of
freedom for signal and the eigensequence plots may be determined using prior
information in the spectral domain which is presently available, along
with a description of the system and simulated data for estimating $\lambda$.
This work extends the 1972 work of Weinreb and Crosby.
1. INTRODUCTION


Recently Fleming (1983) has suggested that improved temperature
retrievals from satellite soundings may be obtained by use of data from a
sensor which scans forward and back along the satellite track, and thus
"looks" at a particular point in space from several directions as well as
directly down. This idea was suggested by analogy with well known results
from computed tomography techniques in use in medicine. Fleming constructed a
model temperature field and simulated noisy data from three different ray
configurations, one looking straight down only, one having in addition one
forward and one rearward angle, and the third having two forward and two
rearward angles. See Fig. 1. He then recovered the model temperatures on a
two dimensional grid with one axis vertical and one axis along the satellite
track, by a numerically efficient iterative procedure for solving large linear
systems. He performed the necessary regularization in this ill posed problem
by stopping the iteration. See also Fleming (1977), Wahba (1980). Similar
methods are common in medical applications. Fleming's results in the example
tried were: two additional angles are better than straight down only, and
four are better than two, from the point of view of mean square error.

We are interested in the problem of choice of angles, spacing of
observations, selection of channels and other questions concerning the design
of measuring systems. Weinreb and Crosby (1972) discussed design criteria
which can be used to make an evaluation of alternative satellite designs and
they applied these criteria to the selection of radiometer channels. In this
paper, we begin with what are essentially the design criteria proposed by
Weinreb and Crosby. However, we propose using prior information concerning
meteorological fields in the frequency or spectral domain, rather than the
spatial domain, leading to details which can be different. This approach uses
information which is available at the present time (but was not in 1972!) and
is particularly appropriate for the evaluation and comparison of potential
satellite systems that simultaneously use three dimensional information, as
well as the evaluation of systems which use combined satellite and radiosonde
data. Implicit in the procedures described here is an algorithm for combining
satellite and radiosonde data. Our approach also makes clear the role of
possibly variable bandwidth parameter(s) in system design, a point which has
traditionally been ignored. In Section 2 we derive the design criteria in our
form (as opposed to the form used by Weinreb and Crosby) and also note how
data from different systems can be combined. In Section 3 we describe the
idea of the "effective rank" of a system, which is roughly equivalent to the
"degrees of freedom for signal" associated with a design. The "degrees of
freedom for signal" is related to but not exactly the same as one of the
criteria used by Weinreb and Crosby, and is analogous to the usual degrees of
freedom for signal in analysis of variance. We suggest the use of
eigensequence plots, along with the GCV (generalized cross validation) estimate
of the bandwidth (or signal to noise ratio) parameter marked on these plots, to
evaluate and compare different systems from the point of view of degrees of
freedom for signal.
The details of the approach described here are perfectly general and can
be used to make a preliminary evaluation of combinations of measuring systems,
for example the use of several satellites simultaneously, and the combined
use of direct and indirect measurements.


2. DESIGN CRITERIA 

For simplicity we will make assumptions similar to those made by Fleming,
that is, that surface temperature is known accurately, that an initial (either
first guess or climatology) value $T_0$ of the temperature field is known, and
that it is adequate$^{1/}$ to linearize the Planck function about $T_0$. With
these approximations, given a particular design, the data may be modelled as
follows, after subtracting out the mean:

$$y_{k,\theta,\nu} = \int_{\mathrm{ray}(\theta,k)} K_{\theta,\nu}(x - x_k, p)\, T^{\delta}(x,p)\, dx\, dp + \epsilon_{k,\theta,\nu} \tag{1}$$

where $\theta$ is the nadir angle, $\nu$ is the central frequency of the
spectral window, $p$ is pressure, $x$ is the distance along the subsatellite
track, $x_k$ is the $k$th subsatellite point, and $K_{\theta,\nu}$ represents
the instrumental spectral response function convolved with the atmospheric
transmittance along a ray with nadir angle $\theta$. The integral is along the
ray with subsatellite point $x_k$ and nadir angle $\theta$. Refer to Fig. 1.
Here $T^{\delta}(x,p) = T(x,p) - T_0(x,p)$ and the $\epsilon_{k,\theta,\nu}$
represent measurement, quadrature, and modelling errors. See, e.g. Wark and
Fleming (1966), Fritz et al. (1972). We shall assume that the observations
have been normalized so that $E\epsilon^2_{k,\theta,\nu}$ is roughly constant
and the $\epsilon_{k,\theta,\nu}$ are roughly independent.

Next, we shall assume that $T^{\delta}$ possesses a (generalized) Fourier
series expansion in some appropriate basis functions in $x$ and $p$, for
example:

$$T^{\delta}(x,p) = \sum_{\alpha,\gamma} \tau_{\alpha,\gamma}\, \psi_\alpha(x)\, \phi_\gamma(p). \tag{2}$$

If the temperature is going to be retrieved around a circle, it may be
appropriate to let the $\psi_\alpha$ be sines and cosines; the $\phi_\gamma$
are appropriate (continuous) orthogonal functions in the vertical. If one were
carrying out this study on the globe, spherical harmonics might be
appropriate. In general, the $\{\psi_\alpha(x)\phi_\gamma(p)\}$ are
conveniently taken to be orthonormal over an appropriate region. In other
contexts Hough functions might be used. See Wahba (1982a).


1/ For a more careful approach to the nonlinearity, the linearization in
O'Sullivan (1983) p. 78 may be used.
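The observations are now modelled as:

$$y_{k,\theta,\nu} = \sum_{\alpha,\gamma} \tau_{\alpha,\gamma} \int_{\mathrm{ray}(\theta,k)} K_{\theta,\nu}(x - x_k, p)\, \psi_\alpha(x)\, \phi_\gamma(p)\, dx\, dp + \epsilon_{k,\theta,\nu} \tag{3}$$

which we can rewrite as

$$y = X\beta + \epsilon \tag{4}$$

where $y$ is the (rearranged) vector of the observations $y_{k,\theta,\nu}$,
$\beta$ is the (rearranged) vector of the $\tau_{\alpha,\gamma}$'s and
$\epsilon$ is the rearranged vector of the $\epsilon_{k,\theta,\nu}$. Letting
$i$ stand for $k,\theta,\nu$ and $j$ stand for $\alpha,\gamma$, we have that
the $i,j$th entry of $X$ is

$$x_{ij} = \int_{\mathrm{ray}(\theta,k)} K_{\theta,\nu}(x - x_k, p)\, \psi_\alpha(x)\, \phi_\gamma(p)\, dx\, dp. \tag{5}$$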

Letting $\beta_j = \tau_{\alpha,\gamma}$, if $T_0$ has been obtained from
climatology and the $\psi$'s and $\phi$'s have been chosen appropriately, a
fair amount of information may be constructed or assumed concerning the prior
distribution of the $\beta_j$'s. See, for example, Baer (1980), Stanford
(1979), Kasahara and Puri (1981), Smith and Woolf (1976). An illustration of
the explicit use of Stanford's results in this context can be found in Wahba
(1982b). We shall suppose that the $\beta_j$'s have a prior mean of zero, and
a prior covariance matrix given by

$$E\beta_j\beta_k = b\,\sigma_{jk}, \qquad \Sigma = \{\sigma_{jk}\}.$$

In the sequel we will be assuming that $\sigma_{jk}$ is known, but the scale
factor $b$ may not be. We suppose that the errors can be modelled
(approximately) as independent Gaussian random variables with a common
(possibly unknown) variance $\sigma^2$. Then a regularized estimate of $\beta$
is $\beta_\lambda$, given by the minimizer of

$$\frac{1}{n}\,\|y - X\beta\|^2 + \lambda\,\beta'\Sigma^{-1}\beta. \tag{6}$$

The minimizer $\beta_\lambda$ is given by

$$\beta_\lambda = \Sigma X'(X\Sigma X' + n\lambda I)^{-1}\, y \tag{7}$$
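As a numerical check on the step from (6) to (7), the following sketch compares the closed form (7) with the minimizer obtained from the normal equations of (6), $(X'X + n\lambda\Sigma^{-1})^{-1}X'y$. The matrix `X` is a random stand-in rather than a quadrature of the kernel in (5), and the sizes, the diagonal `Sigma`, and the value of `lam` are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 12                                      # hypothetical sizes with p > n
X = rng.standard_normal((n, p))                   # stand-in design matrix (assumption)
Sigma = np.diag(1.0 / np.arange(1, p + 1) ** 2)   # assumed diagonal prior covariance
y = rng.standard_normal(n)                        # stand-in data vector
lam = 0.1                                         # illustrative value of lambda


def beta_lambda(X, Sigma, y, lam):
    """Equation (7): Sigma X' (X Sigma X' + n*lam*I)^{-1} y."""
    n = X.shape[0]
    G = X @ Sigma @ X.T + n * lam * np.eye(n)
    return Sigma @ X.T @ np.linalg.solve(G, y)


def beta_normal_equations(X, Sigma, y, lam):
    """Direct minimizer of (6): (X'X + n*lam*Sigma^{-1})^{-1} X'y."""
    n = X.shape[0]
    lhs = X.T @ X + n * lam * np.linalg.inv(Sigma)
    return np.linalg.solve(lhs, X.T @ y)


b1 = beta_lambda(X, Sigma, y, lam)
b2 = beta_normal_equations(X, Sigma, y, lam)
```

Form (7) only requires solving an $n \times n$ system, which is the cheaper route when $p > n$.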
and the temperature estimate is $T_0(x,p) + T^{\delta}_\lambda(x,p)$ where

$$T^{\delta}_\lambda(x,p) = \sum_{\alpha,\gamma} \tau_{\lambda,\alpha,\gamma}\, \psi_\alpha(x)\, \phi_\gamma(p)$$

and the $\tau_{\lambda,\alpha,\gamma}$ are the components of $\beta_\lambda$.
This can also be shown to be the Bayes estimate (Gandin estimate) of $\beta$
with the choice $\lambda = \sigma^2/nb$. That is, $\beta_\lambda$ is the
conditional expectation of $\beta$ given the data. This result is found in a
more general setting in Kimeldorf and Wahba (1971); see also Wahba (1978a).

In practice the estimate can be extremely sensitive to the choice of
$\lambda$ and not "robust" to misspecification of $\sigma^2/nb$ or other
modelling assumptions,$^{1/}$ and so $\lambda$ should be chosen either from
experience ("by eyeball") or by a good data based method such as generalized
cross validation (GCV) (see e.g. Craven and Wahba (1979), Golub, Heath and
Wahba (1979), Halem and Kalnay (1983), Wahba and Wendelberger (1980)). We
will, for the moment, however, leave $\lambda$ as a parameter. Now, suppose
our criterion for preferring one design over another is to minimize the
expected integrated mean square error (IMSE), where

$$\mathrm{IMSE} = \int_{\text{area of interest}} \left(T^{\delta}_\lambda(x,p) - T^{\delta}(x,p)\right)^2 dx\, dp. \tag{8}$$

Expanding (8) in the $\psi_\alpha\phi_\gamma$ gives

$$\mathrm{IMSE} = \sum_{j,k} (\beta_j - \beta_{\lambda,j})\, q_{jk}\, (\beta_k - \beta_{\lambda,k}) \tag{9}$$

where, if $j = (\alpha,\gamma)$ and $k = (\alpha',\gamma')$, then

$$q_{jk} = \int_{\text{area of interest}} \psi_\alpha(x)\,\phi_\gamma(p)\,\psi_{\alpha'}(x)\,\phi_{\gamma'}(p)\, dx\, dp.$$

We now take the expected value of (9), over both the distribution of the
$\beta_j$ and the $\epsilon_i$. Substitution of

$$\beta_\lambda = \Sigma X'(X\Sigma X' + n\lambda I)^{-1}(X\beta + \epsilon)$$

into (9) gives


1/ See Appendix B.
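$$E\,\mathrm{IMSE}(\lambda) = E\{\beta'M_1'QM_1\beta - 2\beta'M_1'QM_2\epsilon + \epsilon'M_2'QM_2\epsilon\} \tag{10}$$

where $Q$ is the matrix with $jk$th entry $q_{jk}$ and

$$M_1 = I - \Sigma X'(X\Sigma X' + n\lambda I)^{-1}X,$$

$$M_2 = \Sigma X'(X\Sigma X' + n\lambda I)^{-1}.$$

Carrying out the expectation operation in (10), after assuming that
$E\epsilon_i\beta_j = 0$, gives

$$E\,\mathrm{IMSE}(X,\lambda) = \mathrm{Trace}\,\{b\,M_1'QM_1\Sigma + \sigma^2 M_2'QM_2\}. \tag{11}$$

Letting $Q^{1/2}$ and $\Sigma^{1/2}$ be the symmetric square roots of $Q$ and
$\Sigma$, it is shown in Appendix A that rearranging (11) results in

$$\frac{1}{b}\,\mathrm{IMSE}(X,\lambda) = \mathrm{Trace}\; Q^{1/2}\{\Sigma - \Sigma X'(X\Sigma X' + n\lambda I)^{-1}X\Sigma\}Q^{1/2}
+ \left(\frac{\sigma^2}{b} - n\lambda\right)\mathrm{Trace}\; Q^{1/2}\{\Sigma X'(X\Sigma X' + n\lambda I)^{-2}X\Sigma\}Q^{1/2}. \tag{12}$$

It can be shown that the right hand side of (12) is minimized over $\lambda$
for $n\lambda = \sigma^2/b$. Making this choice for $\lambda$ gives

$$\frac{1}{b}\,\mathrm{IMSE}(X) = \mathrm{Trace}\; Q^{1/2}\left\{\Sigma - \Sigma X'\left(X\Sigma X' + \frac{\sigma^2}{b} I\right)^{-1}X\Sigma\right\}Q^{1/2}. \tag{13}$$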
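The claim that the right hand side of (12) is smallest at $n\lambda = \sigma^2/b$ can be checked numerically on a small example. The sketch below evaluates (12) on a grid of $n\lambda$ values, using $\mathrm{Trace}\,Q^{1/2}MQ^{1/2} = \mathrm{Trace}\,QM$ to avoid forming the square root. The design matrix, `Sigma`, `Q`, and the value `ratio` (standing for $\sigma^2/b$) are all illustrative assumptions.

```python
import numpy as np


def imse_over_b(X, Sigma, Q, n_lam, ratio):
    """Right hand side of (12); `ratio` stands for sigma^2 / b."""
    n = X.shape[0]
    G_inv = np.linalg.inv(X @ Sigma @ X.T + n_lam * np.eye(n))
    SXt = Sigma @ X.T
    # Trace Q^{1/2} M Q^{1/2} equals Trace Q M for any square M.
    first = np.trace(Q @ (Sigma - SXt @ G_inv @ X @ Sigma))
    second = (ratio - n_lam) * np.trace(Q @ SXt @ G_inv @ G_inv @ X @ Sigma)
    return first + second


rng = np.random.default_rng(1)
n, p = 6, 10                                  # hypothetical sizes
X = rng.standard_normal((n, p))               # stand-in design matrix
Sigma = np.diag(1.0 / np.arange(1, p + 1))    # assumed diagonal prior covariance
Q = np.eye(p)                                 # area of interest matches the basis
ratio = 0.5                                   # assumed sigma^2 / b

# Grid of n*lambda values that contains the claimed minimizer `ratio`.
grid = ratio * np.array([0.1, 0.3, 1.0, 3.0, 10.0])
values = [imse_over_b(X, Sigma, Q, m, ratio) for m in grid]
best = grid[int(np.argmin(values))]
```

On this grid the smallest value occurs at `best == ratio`, in agreement with the statement following (12).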

Typically it will be possible to choose the $\psi_\alpha$ and $\phi_\gamma$
so that $\Sigma$ is diagonal. (Usually, information about cross covariances is
not readily available anyway.) If the area of interest and the area over which
the $\psi_\alpha\phi_\gamma$ are orthonormal coincide, then $Q$ will be
diagonal, thus making (13) more transparent. In any case, we want to choose
$X$ so that the right hand side of (13) is as small as possible. We have the
following

Theorem: Let $X_1$ and $X_2$ be two design matrices of the same dimension and
suppose that $\beta'X_1'X_1\beta \ge \beta'X_2'X_2\beta$ for all $\beta$.
(That is, $X_1'X_1 - X_2'X_2$ is non negative definite.) Then

$$\mathrm{IMSE}(X_1) \le \mathrm{IMSE}(X_2) \tag{14}$$

for any non negative definite $Q$ and $\sigma^2/b > 0$.


Proof: See Appendix A.
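The ordering in the Theorem can be illustrated numerically. In the sketch below, `X2` is obtained from `X1` by shrinking each row by a factor in $[0,1]$, so that $X_1'X_1 - X_2'X_2$ is non negative definite by construction, and the right hand side of (13) is evaluated for both designs. The matrices, the randomly generated `Q`, and the value `ratio` (standing for $\sigma^2/b$) are illustrative assumptions only.

```python
import numpy as np


def imse13(X, Sigma, Q, ratio):
    """Right hand side of (13); `ratio` stands for sigma^2 / b."""
    n = X.shape[0]
    G = X @ Sigma @ X.T + ratio * np.eye(n)
    inner = Sigma - Sigma @ X.T @ np.linalg.solve(G, X @ Sigma)
    return np.trace(Q @ inner)   # equals Trace Q^{1/2} {...} Q^{1/2}


rng = np.random.default_rng(2)
n, p = 6, 10
X1 = rng.standard_normal((n, p))              # stand-in design 1
D = np.diag(rng.uniform(0.0, 1.0, size=n))    # row shrinkage factors in [0, 1]
X2 = D @ X1                                   # X1'X1 - X2'X2 = X1'(I - D^2)X1 >= 0
Sigma = np.diag(1.0 / np.arange(1, p + 1))    # assumed diagonal prior covariance
R = rng.standard_normal((p, p))
Q = R @ R.T                                   # an arbitrary non negative definite Q

imse1 = imse13(X1, Sigma, Q, ratio=0.5)
imse2 = imse13(X2, Sigma, Q, ratio=0.5)
```

Here `imse1` does not exceed `imse2`, as (14) requires; a single random example of course illustrates the Theorem rather than proving it.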

Unfortunately this provides only a partial ordering. We would like to find a
more graphic way of evaluating a design, or comparing two designs, independent
of $Q$. We will do this in the next Section.

We remark that if radiosonde information is to be combined with satellite
information, then one just increases the dimension of the data vector $y$ in
(4). If $y_k$ is a direct measurement of temperature at a point $(x_k, p_k)$
then this just adds a row to the $X$ matrix with entries
$x_{kj} = \psi_\alpha(x_k)\,\phi_\gamma(p_k)$.

If different measuring systems are being combined it is appropriate to scale
the observations in units chosen so that the $\epsilon_i$ are about the same
size.
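A minimal sketch of appending direct-measurement rows follows. The basis functions `psi` (sines and cosines on a circle) and `phi` (powers of a pressure scaled to $[0,1]$) are hypothetical stand-ins chosen only to make the row construction concrete; they are not orthonormal and are not the bases used in the paper. The satellite block `X_sat` is random filler with the same column ordering $j = (\alpha,\gamma)$.

```python
import numpy as np


def psi(x, n_alpha):
    """Hypothetical horizontal basis on a circle: 1, cos x, sin x, cos 2x, ..."""
    funcs = [1.0]
    k = 1
    while len(funcs) < n_alpha:
        funcs.append(np.cos(k * x))
        if len(funcs) < n_alpha:
            funcs.append(np.sin(k * x))
        k += 1
    return np.array(funcs)


def phi(p, n_gamma):
    """Hypothetical vertical basis: powers of a pressure scaled to [0, 1]."""
    return np.array([p ** g for g in range(n_gamma)])


def direct_row(x_k, p_k, n_alpha, n_gamma):
    """Row with entries psi_alpha(x_k) * phi_gamma(p_k), ordered by j = (alpha, gamma)."""
    return np.kron(psi(x_k, n_alpha), phi(p_k, n_gamma))


n_alpha, n_gamma = 3, 2
rng = np.random.default_rng(3)
X_sat = rng.standard_normal((4, n_alpha * n_gamma))   # stand-in satellite rows
sondes = [(0.0, 0.5), (np.pi / 2, 1.0)]                # hypothetical (x_k, p_k) points
X_sonde = np.array([direct_row(x, p, n_alpha, n_gamma) for x, p in sondes])
X_combined = np.vstack([X_sat, X_sonde])               # enlarged design matrix
```

In a real study the two blocks would first be rescaled so that their error variances are comparable, as noted above.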


3. EIGENSEQUENCE PLOTS, EFFECTIVE RANK, AND DEGREES OF FREEDOM FOR SIGNAL. 

Letting the dimension of $X$ be $n \times p$, we have not discussed the
relative size of $n$ and $p$. In meteorological work it is frequently
reasonable that $p > n$, since meteorological fields contain information at
all scales. Certainly in the design phase one should allow $p$ to be as large
as computationally feasible consistent with the availability of (measured,
theoretical, or conjectured) prior variances. One does not expect to get very
good estimates of individual $\beta_j$ with $p > n$; however, it is
$T^{\delta}(x,p)$ that is actually desired, and good estimates of
$T^{\delta}(x,p)$ may be obtainable even though some of the individual
coefficient estimates appear poor. Inspection of (7) shows that the number of
linearly independent pieces of information in $y$ available for estimating
$\beta$ (and hence $T^{\delta}$) is limited by the number of eigenvalues of
$X\Sigma X'$ which are at least not negligible compared to
$n\lambda$.$^{2/}$ The "signal" along an eigenvector with eigenvalue much less
than $n\lambda$ will be down in the "noise". Proceeding under the assumption
that $n < p$, it is typical nevertheless, in ill posed problems, that the
"effective rank" of matrices playing the role of $X\Sigma X'$ is much less
than $n$, when $n$ is large. The "effective rank" of $X\Sigma X'$ can be
roughly defined as the number of eigenvalues of $X\Sigma X'$ not small
compared to the noise (relative to $b$) in the system. (See Wahba (1980).)
This "noise" in practice includes not only the measurement error, but the
errors in modelling the atmospheric transmittance functions, in linearizing
Planck's function, and in computing the integrals in (5) using quadrature
formulae. The effective rank of $X\Sigma X'$ can easily be studied by
plotting the eigenvalues of $X\Sigma X'$ on a log-log plot.
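The rough definition of effective rank above can be sketched as a count of eigenvalues of $X\Sigma X'$ that are at least $n\lambda$. In the example below the design matrix is constructed so that the eigenvalues are known in advance; the threshold `n_lam` and the eigenvalues in `target` are illustrative assumptions, and "at least as large as $n\lambda$" is only one possible reading of "not small compared to the noise".

```python
import numpy as np


def eigensequence(X, Sigma):
    """Eigenvalues of X Sigma X', sorted from largest to smallest."""
    eigs = np.linalg.eigvalsh(X @ Sigma @ X.T)
    return np.clip(eigs[::-1], 0.0, None)   # clip tiny negative rounding errors


def effective_rank(eigs, n_lam):
    """Number of eigenvalues at least as large as n * lambda."""
    return int(np.sum(eigs >= n_lam))


rng = np.random.default_rng(4)
n = 5
U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal matrix
target = np.array([100.0, 10.0, 1.0, 0.1, 0.01])   # assumed eigenvalues
X = U @ np.diag(np.sqrt(target))                   # so X X' = U diag(target) U'
Sigma = np.eye(n)

eigs = eigensequence(X, Sigma)
rank = effective_rank(eigs, n_lam=0.5)
```

Plotting `eigs` against its index on log-log axes, with a horizontal line at $n\lambda$, gives an eigensequence plot of the kind described in this section.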


2/ Where $n\lambda$ is appropriately chosen; see Appendix B.
Figure 2 gives an eigensequence plot of the eigenvalues reprinted from
Nychka, Wahba, Goldfarb and Pugh (1983) (NWGP). The problem in NWGP is a
mildly ill posed problem concerned with the recovery of three dimensional
tumor size distributions from tumor radii observed from two dimensional
slices. This is a tomographic problem of a somewhat different form than the
one under study. Nevertheless, there are some common problems. There were
$n = 80$ observations; 68 of the 80 eigenvalues appear on this plot. The
precipitous drop off of the last few eigenvalues has been attributed to
artifacts of the quadrature procedure. Data from an active experiment using
the design behind this plot was actually analyzed, and $n\hat\lambda$
estimated by GCV appears on the figure. In practice $n\hat\lambda$ would
appear instead of $n\lambda$ in (7), where, in the design phase,
$\hat\lambda$ would be obtained by simulating realistic examples. One can see
that there are only 6 eigenvalues at least as large as $n\hat\lambda$.
Strictly speaking, comparing the eigensequence plots for $X_1\Sigma X_1'$ and
$X_2\Sigma X_2'$ does not necessarily provide enough information for choosing
between $X_1$ and $X_2$ on the basis of criterion (13); nevertheless, these
plots can be quite informative.


A measure of comparison between $X_1$ and $X_2$ which depends only on the
respective eigenvalues and $\lambda$ is the "degrees of freedom for
signal."$^{1/}$ We may define d.f. signal $(X,\lambda^*)$ as

$$\text{d.f. signal}\,(X,\lambda^*) = \mathrm{trace}\,(X\Sigma X')(X\Sigma X' + n\lambda^* I)^{-1} = \sum_{\nu=1}^{n} \frac{\lambda_\nu}{\lambda_\nu + n\lambda^*}$$

where $\lambda_\nu$, $\nu = 1,2,\ldots,n$ are the eigenvalues of
$X\Sigma X'$, and $\lambda^*$ is a good choice of $\lambda$. To understand
this definition, which is analogous to similar definitions in analysis of
variance, observe that $y$ can be decomposed into signal and noise as follows

$$y = y_{\lambda^*} + e_{\lambda^*}$$


1/ Weinreb and Crosby's trace M, of their eqn. (10), would correspond to
trace $X\Sigma^2X'(X\Sigma X' + n\lambda^* I)^{-1}$. The present criterion is
likely to be less sensitive to misspecification of $\Sigma$.


[Figure 2. Eigensequence plot; axis label: Eigenvalue.]
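where

$$y_{\lambda^*} = X\beta_{\lambda^*} = X\Sigma X'(X\Sigma X' + n\lambda^* I)^{-1}\, y \qquad \text{(estimated signal)}$$

$$e_{\lambda^*} = n\lambda^*(X\Sigma X' + n\lambda^* I)^{-1}\, y \qquad \text{(estimated noise)}$$

and

$$n = \mathrm{trace}\; I = \mathrm{trace}\; X\Sigma X'(X\Sigma X' + n\lambda^* I)^{-1} \;\; \text{(d.f. for signal)} \; + \; \mathrm{trace}\; n\lambda^*(X\Sigma X' + n\lambda^* I)^{-1} \;\; \text{(d.f. for noise)}.$$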

It is necessary, of course, that the $\lambda^*$ used provides a good
partition of $y$ into signal and noise for this definition to be valid. It is
clear that one wants as many eigenvalues as possible to be large compared to
$n\lambda^*$. One can make a loose association of the d.f. for signal with
the "effective rank." Thus, $X_1$ is to be preferred to $X_2$ if

$$\text{d.f. signal}\,(X_1,\lambda_1^*) > \text{d.f. signal}\,(X_2,\lambda_2^*). \tag{15}$$

We have deliberately allowed $\lambda_1^*$ and $\lambda_2^*$ to be different,
and not necessarily equal to $\sigma^2/nb$, since in practice, as well as in
Monte Carlo experiments with a small number of examples, the optimum
$\lambda$ may depend on $X$ as well as [...] and the noise in the system.
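The comparison in (15) depends on each design only through its eigensequence and its $n\lambda^*$, so it can be sketched with plain sums. The two eigensequences and the common value of `n_lam` below are hypothetical numbers chosen for illustration; in practice each design would have its own $n\lambda^*$.

```python
def df_signal(eigs, n_lam):
    """Sum over eigenvalues of lambda_nu / (lambda_nu + n*lambda)."""
    return sum(e / (e + n_lam) for e in eigs)


def df_noise(eigs, n_lam):
    """Sum over eigenvalues of n*lambda / (lambda_nu + n*lambda)."""
    return sum(n_lam / (e + n_lam) for e in eigs)


eigs_1 = [100.0, 10.0, 1.0, 0.1]   # hypothetical eigensequence for design X1
eigs_2 = [100.0, 1.0, 0.1, 0.01]   # hypothetical eigensequence for design X2
n_lam = 1.0                        # illustrative n * lambda, same for both here

d1 = df_signal(eigs_1, n_lam)
d2 = df_signal(eigs_2, n_lam)
total_1 = d1 + df_noise(eigs_1, n_lam)   # should equal n = 4
```

With these numbers `d1` exceeds `d2`, so design 1 would be preferred under (15), and the signal and noise degrees of freedom add up to $n$.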

The GCV estimate $\hat\lambda$ of $\lambda$ is the minimizer of

$$V(\lambda) = \frac{\dfrac{1}{n}\,\|y - X\Sigma X'(X\Sigma X' + n\lambda I)^{-1}y\|^2}{\left[\dfrac{1}{n}\,\mathrm{Trace}\left(I - X\Sigma X'(X\Sigma X' + n\lambda I)^{-1}\right)\right]^2}$$

and can be obtained as part of a realistic Monte Carlo study. Of course, it
is quite possible that the eigensequence plots will show that the choice
between $X_1$ and $X_2$ on the basis of d.f. signal is insensitive to the
choice of $\lambda$.
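A small Monte Carlo style sketch of the GCV estimate follows. It simulates one data vector from model (4) with $\beta \sim N(0, b\Sigma)$ and independent Gaussian errors, evaluates $V(\lambda)$ on a logarithmic grid, and takes the grid minimizer. The design matrix, the sizes, `b`, `sigma`, and the grid are illustrative assumptions, and a grid search is used here only for simplicity; it is not claimed to be the procedure used in the cited GCV references.

```python
import numpy as np


def gcv(lam, X, Sigma, y):
    """V(lambda) with influence matrix A = K (K + n*lam*I)^{-1}, K = X Sigma X'."""
    n = X.shape[0]
    K = X @ Sigma @ X.T
    A = K @ np.linalg.inv(K + n * lam * np.eye(n))
    resid = y - A @ y
    return (resid @ resid / n) / (np.trace(np.eye(n) - A) / n) ** 2


rng = np.random.default_rng(5)
n, p = 30, 40                                   # hypothetical sizes with p > n
X = rng.standard_normal((n, p)) / np.sqrt(p)    # stand-in design matrix
sig_diag = 1.0 / np.arange(1, p + 1) ** 2       # assumed diagonal of Sigma
Sigma = np.diag(sig_diag)
b, sigma = 1.0, 0.05                            # assumed prior scale and noise s.d.

beta = np.sqrt(b * sig_diag) * rng.standard_normal(p)   # beta ~ N(0, b * Sigma)
y = X @ beta + sigma * rng.standard_normal(n)           # simulated data, model (4)

grid = np.logspace(-6, 1, 29)                   # candidate values of lambda
values = np.array([gcv(lam, X, Sigma, y) for lam in grid])
lam_hat = grid[int(np.argmin(values))]          # grid approximation to the GCV estimate
```

Repeating this over several simulated realizations gives a spread of `lam_hat` values that can be marked on the eigensequence plot for the design.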


Eigenvalues of symmetric nonnegative definite matrices of dimension up to
several hundred can be computed using double precision EISPACK (Smith et al.
(1976)). If $\Sigma$ is diagonal, it may be cheaper and more accurate to
compute the singular values of $\Sigma^{1/2}X'$ using the singular value
decomposition in LINPACK (Dongarra et al. (1979)). The eigenvalues of
$X\Sigma X'$ are the squares of the singular values of $\Sigma^{1/2}X'$.
Approximate information concerning very much larger matrices may be obtained
using the truncated singular value decomposition in Bates and Wahba (1982).
It is conjectured that eigensequence plots comparing different satellite
scanning designs will show that, e.g., combining side looking scans from
successive passes of a satellite along with data in the plane of the orbit
(as suggested by Suomi (1983)) would have highly desirable properties.
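The relation between the eigenvalues of $X\Sigma X'$ and the singular values of $\Sigma^{1/2}X'$ can be sketched as follows for a diagonal $\Sigma$. NumPy's LAPACK-based routines are used here as a modern stand-in for the EISPACK and LINPACK routines named in the text; the matrices are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 6, 10                                   # hypothetical sizes
X = rng.standard_normal((n, p))                # stand-in design matrix
sig_diag = 1.0 / np.arange(1, p + 1)           # assumed diagonal of Sigma

# Sigma^{1/2} X' is p x n; scaling the rows of X' avoids forming Sigma at all.
half = np.sqrt(sig_diag)[:, None] * X.T
sv = np.linalg.svd(half, compute_uv=False)     # singular values, largest first
eigs_from_svd = sv ** 2

# Direct route for comparison: eigenvalues of the n x n matrix X Sigma X'.
eigs_direct = np.linalg.eigvalsh(X @ np.diag(sig_diag) @ X.T)[::-1]
```

Working with $\Sigma^{1/2}X'$ avoids squaring the condition number, which is the accuracy advantage alluded to above.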

We close with a few remarks. Quadrature error, in e.g. evaluating the
$x_{ij}$ in (5), can be surprisingly important in ill posed problems and
should not be treated cavalierly, either at the design stage or at the data
analysis stage. This point is discussed in some detail in NWGP, where the use
of matched quadrature for ill posed problems is discussed. Eigensequence plots
obtained via inaccurate quadrature may present a different appearance than
those from a highly accurate quadrature, and a poor quadrature procedure or
unrealistic value of $\lambda$ may mask differences between systems.
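Appendix A. Proofs.


Proof of (12).

Letting $Q^{1/2}$ be the symmetric square root of $Q$,

$$E\,\beta'M_1'QM_1\beta = b\;\mathrm{Trace}\; M_1'QM_1\Sigma = b\;\mathrm{Trace}\; Q^{1/2}M_1\Sigma M_1'Q^{1/2}$$

$$= b\;\mathrm{Trace}\; Q^{1/2}\{\Sigma - 2\Sigma X'(X\Sigma X' + n\lambda I)^{-1}X\Sigma + \Sigma X'(X\Sigma X' + n\lambda I)^{-1}X\Sigma X'(X\Sigma X' + n\lambda I)^{-1}X\Sigma\}Q^{1/2} \tag{A.1}$$

$$E\,\epsilon'M_2'QM_2\epsilon = \sigma^2\,\mathrm{Trace}\; M_2'QM_2 = \sigma^2\,\mathrm{Trace}\; Q^{1/2}M_2M_2'Q^{1/2}$$

$$= \sigma^2\,\mathrm{Trace}\; Q^{1/2}\{\Sigma X'(X\Sigma X' + n\lambda I)^{-2}X\Sigma\}Q^{1/2} \tag{A.2}$$

Using

$$(X\Sigma X' + n\lambda I)^{-1}X\Sigma X'(X\Sigma X' + n\lambda I)^{-1} = (X\Sigma X' + n\lambda I)^{-1} - n\lambda(X\Sigma X' + n\lambda I)^{-2}$$

and adding (A.1) and (A.2) gives

$$\mathrm{Trace}\,\{b\,[I - \Sigma^{1/2}X'(X\Sigma X' + n\lambda I)^{-1}X\Sigma^{1/2}] + (\sigma^2 - n\lambda b)\,\Sigma^{1/2}X'(X\Sigma X' + n\lambda I)^{-2}X\Sigma^{1/2}\}\,\Sigma^{1/2}Q\Sigma^{1/2}$$

which gives (12).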

Proof of Theorem

Suppose that $\beta'X_1'X_1\beta \ge \beta'X_2'X_2\beta$ for any $\beta$. We
will show that this implies that

$$\mathrm{Trace}\; Q^{1/2}\Sigma X_1'(X_1\Sigma X_1' + n\lambda I)^{-1}X_1\Sigma Q^{1/2} \ge \mathrm{Trace}\; Q^{1/2}\Sigma X_2'(X_2\Sigma X_2' + n\lambda I)^{-1}X_2\Sigma Q^{1/2} \tag{A.6}$$

for any $Q$ and $\lambda$.

We will assume that $\Sigma$ is nonsingular. Then our hypotheses imply that
$\beta'\Sigma^{1/2}X_1'X_1\Sigma^{1/2}\beta \ge \beta'\Sigma^{1/2}X_2'X_2\Sigma^{1/2}\beta$
for any $\beta$, in other words,

$$\Sigma^{1/2}X_1'X_1\Sigma^{1/2} \succeq \Sigma^{1/2}X_2'X_2\Sigma^{1/2},$$

where $A \succeq B$ means $A - B$ is nonnegative definite.

Let $A = \Sigma^{1/2}X_1'$, $B = \Sigma^{1/2}X_2'$, where $A$ and $B$ are
$p \times n$.

We have to show that

$$AA' \succeq BB' \;\Rightarrow\; A(A'A + n\lambda I)^{-1}A' \succeq B(B'B + n\lambda I)^{-1}B'.$$

We first show that $A(A'A + n\lambda I)^{-1}A' = AA'(AA' + n\lambda I)^{-1}$.

This is equivalent to showing

$$A(A'A + n\lambda I)^{-1}A'(AA' + n\lambda I) = AA'.$$

Expanding the left hand side gives

$$A(A'A + n\lambda I)^{-1}(A'A)A' + n\lambda A(A'A + n\lambda I)^{-1}A'$$

$$= A(A'A + n\lambda I)^{-1}(A'A + n\lambda I)A' - n\lambda A(A'A + n\lambda I)^{-1}A' + n\lambda A(A'A + n\lambda I)^{-1}A'$$

$$= AA'.$$

Now, let $(AA' + n\lambda I) = C$ and $(BB' + n\lambda I) = D$.

Therefore

$$AA'(AA' + n\lambda I)^{-1} = (C - n\lambda I)C^{-1} = I - n\lambda C^{-1}$$

$$BB'(BB' + n\lambda I)^{-1} = (D - n\lambda I)D^{-1} = I - n\lambda D^{-1}.$$

Now $AA' \succeq BB' \Rightarrow C \succeq D$, and
$C \succeq D \Rightarrow D^{-1} \succeq C^{-1}$ (see, e.g. Marshall and
Olkin, p. 464), which in turn implies that
$I - n\lambda C^{-1} \succeq I - n\lambda D^{-1}$, so the proof is finished.
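The two matrix facts used in the proof can be spot-checked numerically: the identity $A(A'A + n\lambda I)^{-1}A' = AA'(AA' + n\lambda I)^{-1}$, and the ordering of the two sides when $AA' \succeq BB'$. In the sketch below `B` is built from `A` by shrinking its columns, which guarantees $AA' - BB'$ is nonnegative definite; the sizes and `n_lam` are illustrative assumptions.

```python
import numpy as np


def hat(M, n_lam):
    """M (M'M + n*lambda*I)^{-1} M' for a p x n matrix M."""
    k = M.shape[1]
    return M @ np.linalg.solve(M.T @ M + n_lam * np.eye(k), M.T)


rng = np.random.default_rng(7)
p, n, n_lam = 7, 4, 0.3                             # hypothetical sizes and n*lambda
A = rng.standard_normal((p, n))
D = np.diag(rng.uniform(0.0, 1.0, size=n))
B = A @ D                                           # AA' - BB' = A(I - D^2)A' >= 0

lhs = hat(A, n_lam)                                 # A (A'A + n*lam*I)^{-1} A'
rhs = A @ A.T @ np.linalg.inv(A @ A.T + n_lam * np.eye(p))

# Smallest eigenvalue of the difference; nonnegative (up to rounding) if ordered.
gap_min_eig = np.linalg.eigvalsh(hat(A, n_lam) - hat(B, n_lam)).min()
```

Both checks hold for this example: `lhs` and `rhs` agree to rounding error, and `gap_min_eig` is nonnegative up to rounding.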
Appendix B. Remarks on the specification of $\sigma^2/nb$.

Suppose that $T^{\delta}$ has an infinite series expansion in the
$\psi_\alpha\phi_\gamma$. Under the assumption that
$E\beta_j^2 = b\,\sigma_{jj}$, $j = 1,2,\ldots$, and (for mathematical
convenience only) $E\beta_j\beta_k = 0$, $j \ne k$, then for each
$p = 1,2,\ldots$

$$\frac{1}{p}\,E\,\beta'(p)\,\Sigma^{-1}(p)\,\beta(p) = \frac{1}{p}\sum_{j=1}^{p}\frac{E\beta_j^2}{\sigma_{jj}} = b, \tag{B.1}$$

where $\beta(p)$ and $\Sigma(p)$ are the first $p$ and $p \times p$
components of $\beta$ and $\Sigma$ respectively. A different modelling
assumption is that $T^{\delta}$ has the property that

$$\sum_{j=1}^{\infty}\frac{\beta_j^2}{\sigma_{jj}} < \infty. \tag{B.2}$$

Under this assumption $\beta_\lambda$ of (7) is still an appropriate estimate
of the first $p$ components of $\beta$, for appropriately chosen $\lambda$
(see, e.g. Wahba (1977a)), but

$$\frac{1}{p}\,\beta'(p)\,\Sigma^{-1}(p)\,\beta(p) \to 0$$

as $p \to \infty$, so that $b$ is not readily defined independent of $p$. GCV
will return a good estimate of $\lambda$ under either assumption (B.1) or
(B.2) (see Wahba 1977b), and the design criteria resulting from the
assumptions of this paper (i.e. assumption B.1) appear eminently plausible
even if (B.2) is true. A related but somewhat harder to study design
criterion under assumption (B.2) appears in Wahba (1978b).


